1. Introduction


In recent years, organic molecules have garnered increasing attention as components of
high-hyperpolarizability materials, partly due to the variety of synthetically accessible
compounds (1, 2). Applications for materials with high hyperpolarizabilities are found in
telecommunication (3). The nonlinear response of organic molecules often originates in
the conjugated π-system, which facilitates the electronic polarizability. The design of such
molecules in silico is complicated by the fact that chemical space, even when constrained to
smaller organic compounds, is combinatorially complex. The number of organic molecules
of medium size is estimated to be on the order of $10^{200}$ (4). Enumeration is therefore
infeasibly costly, and other methods for property optimization need to be developed.
Including conformational searching further complicates molecular design.

Methods for optimization in discrete spaces have been studied extensively and recently
reviewed (5). Optimization methods include integer programming, as in branch-and-bound
techniques (including dead-end elimination (6)), simulated annealing (7), and genetic
algorithms (8). These algorithms have found renewed interest and application in molecular
and materials design (9-12). Recently, new approaches have been explored to embed
discrete chemical space in continuous spaces to take advantage of continuous optimization
techniques. These include, in particular, activities in our group on the linear combination
of atomic potentials (LCAP) method (13-15) and the approach of von Lilienfeld (16-18),
which uses a grand-canonical ensemble strategy. Here, we further employ continuous
optimization methods aimed at discovering structures with optimal properties.

The  problem  of  discrete  optimization  in  chemical  space  can  be  tackled  by  embedding  the 
discrete  space  in  a  virtual  continuous  space,  parameterized  by  a  set  of  continuous  variables. 
This  strategy  establishes  a  continuous  path  from  one  molecule  to  another.  Such  a  space 
can  be  constructed  by  defining  molecules  as  a  succession  of  replacements  of  an  atom  or 
molecular  fragment  by  another.  These  fragment  or  atom  placements  need  only  satisfy  the 
rules  of  valency.  For  example,  a  hydrogen  in  CH4  might  be  replaced  by  a  halogen  or  a 
methyl  group,  each  corresponding  to  a  specific  geometry  (or  ensemble  of  geometries), 
energy(ies),  and  property  value(s).  It  is  possible  to  construct  a  continuous  transition 
between  Hamiltonians  for  the  chemical  structures  as  was  done  for  LCAP  (13).  Equation  1 
illustrates  the  procedure. 


$$H(\lambda) = \sum_i \lambda_i H_i, \qquad \sum_i \lambda_i = 1, \qquad 0 \le \lambda_i \le 1 \;\; \forall i \tag{1}$$


Each Hamiltonian $H_i$ acts only on its own molecular subspace $\Omega_i$, projecting all other
wave functions out, and $H$ acts on the direct sum of these spaces, $\bigoplus_i \Omega_i$. In equation 1, the
summation constraint implies the mutual exclusivity of the groups in the library (e.g., in
the previous example, as the hydrogen component increases toward 1, the halogen
component decreases toward 0). In this approach, the groups are still linked through the
wave function. Therefore, it is possible that all optima are at non-physical configurations
(e.g., half hydrogen and half halogen in the same location). Starting with each $\lambda_i \in \{0, 1\}$,
it is possible to compute the numerical derivative of a property $P$. We now explore the
application of this idea for discrete optimization of the first hyperpolarizability.


2.  Methods 


2.1  Linear  Interpolation  of  Discrete  Spaces 

Analogous to LCAP optimization, any property can be interpolated in a virtual continuous
space. We call the interpolated space “virtual” since non-integer $\lambda_i$-values correspond to
intermediate or “alchemical” species. In general, given a library with $N$ molecules with
property values $P_s$ for molecule $s$, $\log_2 N$ variables may be used to embed the discrete
library in the continuous space $[0, 1]^{\log_2 N}$. For example, assume a library consisting of
methane, ethane, propane, and butane in exactly that order (figure 1). It is possible to
interpolate among the four molecules using the parameters $\lambda_0$ and $\lambda_1$. A (quadratic)
polynomial interpolating the ground state energies (for example) is

$$E(\lambda_0, \lambda_1) = E_0 (1 - \lambda_0)(1 - \lambda_1) + E_1 \lambda_0 (1 - \lambda_1) + E_2 (1 - \lambda_0) \lambda_1 + E_3 \lambda_0 \lambda_1 \tag{2}$$

This energy equation has a well-defined minimum. Interpolation using a single variable for
this set of compounds would produce a third-degree polynomial, but homogeneous
solutions to third-order polynomials are not trivial, and the optimum is not guaranteed to
correspond to a molecule, i.e., $\lambda \in \{0, 1, 2, 3\}$.
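The dependence of the interpolated surface on the library ordering can be made concrete
with a small sketch of equation 2. The energies below are placeholder numbers chosen for
illustration, not computed values, and the function names are hypothetical. Expanding
equation 2 shows that the coefficient of $\lambda_0 \lambda_1$ is $E_0 - E_1 - E_2 + E_3$; the sketch checks
that the vertices reproduce the library values and that swapping two library entries
changes this cross term.

```python
def bilinear_energy(energies, lam0, lam1):
    """Evaluate equation 2 for four library energies E_0..E_3."""
    e0, e1, e2, e3 = energies
    return (e0 * (1 - lam0) * (1 - lam1) + e1 * lam0 * (1 - lam1)
            + e2 * (1 - lam0) * lam1 + e3 * lam0 * lam1)


def cross_term(energies):
    """Coefficient of lam0*lam1 obtained by expanding equation 2."""
    e0, e1, e2, e3 = energies
    return e0 - e1 - e2 + e3


# Placeholder energies (arbitrary units); NOT results from the report.
ordered = [-1.0, -2.0, -3.0, -4.0]
swapped = [-1.0, -2.0, -4.0, -3.0]   # last two library entries exchanged

# Each vertex (lam0, lam1) with s = 2*lam1 + lam0 reproduces E_s.
for s in range(4):
    lam0, lam1 = s & 1, (s >> 1) & 1
    assert bilinear_energy(ordered, lam0, lam1) == ordered[s]

print(cross_term(ordered), cross_term(swapped))
```

With these placeholder values the first ordering gives a purely linear surface, while the
swapped ordering introduces a nonzero bilinear term, although both orderings contain the
same four energies.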


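Molecule    s    λ1,λ0
CH4         0    0,0
C2H6        1    0,1
C3H8        2    1,0
C4H10       3    1,1

[Plot: unit square spanned by $\lambda_0$ and $\lambda_1$ with corners labeled 00, 01, 10, and 11.]

Figure 1. Simple example for interpolation. The bits $\lambda_1 \lambda_0$ represent the molecule number
$s = 2\lambda_1 + \lambda_0$ in the binary system.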
The preceding example highlights the dependence of the property polynomial on the
ordering of the molecules. Generalization of the example to a library $C$ of size $N$ leads to
equations 3 and 4. Equation 3 describes the bit-string (binary) representation of a number
$s$ with bit $s(i)$ at the $i$th position:

$$s = \sum_i s(i) \cdot 2^i, \qquad s(i) \in \{0, 1\} \tag{3}$$

$$\tilde{P}(\lambda) = \sum_{s=0}^{N-1} P_s \prod_{b=1}^{\log_2 N} \big( (1 - \lambda_b)(1 - s(b)) + \lambda_b\, s(b) \big) \tag{4}$$

Equation 4 defines the property interpolation $\tilde{P}$ based on the bit-strings. We differentiate
between $\tilde{P}$ and $P$ to emphasize the domain of definition. The former is defined on the
“virtual” space $[0, 1]^{\log_2 N}$, while the latter is defined on the discrete space $C$. This
polynomial, of the same order as it has variables ($\log_2 N$), is continuous on $[0, 1]^{\log_2 N}$.
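A minimal sketch of the interpolation in equation 4 is given below, assuming the library
size is a power of two and that bit $b$ of the molecule number $s$ is paired with the
parameter `lam[b]` (bits are counted from 0 in the code). The function name and the
property values are illustrative assumptions.

```python
from math import prod  # available in Python 3.8 and later


def interpolate(values, lam):
    """Multilinear interpolation of library values over bit strings.

    values[s] is the property of molecule s; lam[b] in [0, 1] is the
    continuous parameter paired with bit b of s.
    """
    if len(values) != 2 ** len(lam):
        raise ValueError("library size must equal 2**len(lam)")
    total = 0.0
    for s, p_s in enumerate(values):
        # Weight of molecule s: lam[b] where bit b is set, 1 - lam[b] otherwise.
        weight = prod(l if (s >> b) & 1 else 1 - l for b, l in enumerate(lam))
        total += p_s * weight
    return total


# Hypothetical property values for a library of 8 molecules.
values = [3.0, 1.0, 4.0, 1.5, 9.0, 2.0, 6.0, 5.0]

# At a vertex, only one weight is nonzero, so the library value is recovered.
print(interpolate(values, [1, 0, 1]))        # molecule s = 5
# At the center of the cube, every molecule contributes equally.
print(interpolate(values, [0.5, 0.5, 0.5]))
```

The loop visits all $N$ molecules, so this direct evaluation is only practical for small
libraries; it is meant to illustrate the definition, not to serve as an efficient scheme.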

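2.2 Derivatives of $\tilde{P}$


In order to use conventional optimization algorithms on continuous spaces, it is necessary
to find the derivatives of the interpolated property $\tilde{P}$.

$$\frac{\partial \tilde{P}}{\partial \lambda_j}(\lambda) = \sum_{s=0}^{N-1} P_s\, (-1)^{1 - s(j)} \prod_{b \neq j}^{\log_2 N} \big( (1 - \lambda_b)(1 - s(b)) + \lambda_b\, s(b) \big) \tag{5}$$

$$\frac{\partial^2 \tilde{P}}{\partial \lambda_k\, \partial \lambda_l}(\lambda) = \sum_{s=0}^{N-1} P_s\, (-1)^{s(k) + s(l)} \prod_{b \notin \{k, l\}}^{\log_2 N} \big( (1 - \lambda_b)(1 - s(b)) + \lambda_b\, s(b) \big), \qquad l \neq k \tag{6}$$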


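Equations 5 and 6 show first- and second-order analytical derivatives of $\tilde{P}$. The derivative
of $\tilde{P}$ at the $\lambda$ corresponding to the molecule with number $s$ in the library $C$ can be computed
from the nearest bit-string neighbors $\{s^{(j)}, s^{(k,l)}\}$:

$$s^{(j)} = s + (-1)^{s(j)} \cdot 2^j \tag{7}$$

$$s^{(k,l)} = s + (-1)^{s(k)} \cdot 2^k + (-1)^{s(l)} \cdot 2^l \tag{8}$$

$$\lambda_i = s(i): \qquad \frac{\partial \tilde{P}}{\partial \lambda_j}(\lambda) = (-1)^{1 - s(j)} \big( P_s - P_{s^{(j)}} \big) \tag{9}$$

$$\frac{\partial^2 \tilde{P}}{\partial \lambda_k\, \partial \lambda_l}(\lambda) = (-1)^{s(k)} (-1)^{s(l)} \big( P_s - P_{s^{(k)}} - P_{s^{(l)}} + P_{s^{(k,l)}} \big), \qquad l \neq k, \quad \lambda_i = s(i) \tag{10}$$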


The highly nonlinear but continuous description $\tilde{P}$ allows the development of optimization
methods by replacing derivatives with finite differences in continuous optimization
methods. In this case, the analytical property derivatives for a molecule are computed from
simple (finite) property value differences, unlike in LCAP. The derivatives of LCAP need
not lie on straight lines pointing from one physical (non-“alchemical”) molecule to another,
although the property values of each real molecule are the same for either optimization
scheme.
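The statement that derivatives at a molecule reduce to simple property differences can be
checked with a short sketch. It assumes the multilinear interpolation over bit strings
described in section 2.1, with bits counted from 0, and uses hypothetical property values;
the function names are illustrative. Because the interpolation is linear in each
parameter, a one-sided difference quotient along one parameter is exact up to rounding.

```python
from math import prod  # Python 3.8 and later


def interpolate(values, lam):
    """Multilinear interpolation (bit b of s is paired with lam[b])."""
    return sum(
        p * prod(l if (s >> b) & 1 else 1 - l for b, l in enumerate(lam))
        for s, p in enumerate(values)
    )


def gradient_at_vertex(values, s, n_bits):
    """Derivatives at the vertex of molecule s from single-bit neighbors."""
    grad = []
    for j in range(n_bits):
        neighbor = s ^ (1 << j)               # flip bit j of s
        sign = 1 if (s >> j) & 1 else -1      # (-1)**(1 - s(j))
        grad.append(sign * (values[s] - values[neighbor]))
    return grad


values = [3.0, 1.0, 4.0, 1.5]   # hypothetical property values P_0..P_3
s, n_bits, h = 2, 2, 0.25
lam = [float((s >> b) & 1) for b in range(n_bits)]

for j, analytic in enumerate(gradient_at_vertex(values, s, n_bits)):
    shifted = list(lam)
    shifted[j] += h if lam[j] == 0.0 else -h   # step into the unit cube
    numeric = ((interpolate(values, shifted) - interpolate(values, lam))
               / (shifted[j] - lam[j]))
    assert abs(numeric - analytic) < 1e-12
```

Only the property values of the molecule and its single-substitution neighbors enter the
gradient, which is why no evaluation at intermediate, "alchemical" points is needed.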
2.3 Comparison with Dead-End Elimination


To compare our approach (equation 4) with dead-end-elimination algorithms (DEE), we
consider the minimization of a pairwise additive property function comprised of unary
contributions $P_i^{(\mu)}$ acting on site $i$ with occupation $\mu$ and binary contributions $P_{ij}^{(\mu,\nu)}$ acting
on sites $i$ and $j$ with occupations $\mu$ and $\nu$ (equation 11):

$$P_s = \sum_i P_i^{(s(i))} + \sum_{i<j} P_{ij}^{(s(i), s(j))} \tag{11}$$


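Collecting all terms, we find that the interpolated property $\tilde{P}$ depends quadratically on the
parameters $\lambda_i$ through the pairwise terms $P_{ij}$ (equation 12). Consequently, the derivatives
are linear with respect to $\lambda_j$ (equation 13), where $P_{ij}^{(\mu,\nu)} = P_{ji}^{(\nu,\mu)}$ is understood for $j < i$:

$$\tilde{P}(\lambda) = \sum_i \big( P_i^{(1)} \lambda_i + P_i^{(0)} (1 - \lambda_i) \big) + \sum_{i<j} \big( P_{ij}^{(1,1)} \lambda_i \lambda_j + P_{ij}^{(0,1)} (1 - \lambda_i) \lambda_j + P_{ij}^{(1,0)} \lambda_i (1 - \lambda_j) + P_{ij}^{(0,0)} (1 - \lambda_i)(1 - \lambda_j) \big) \tag{12}$$

$$\frac{\partial \tilde{P}}{\partial \lambda_i} = P_i^{(1)} - P_i^{(0)} + \sum_{j \neq i} \big( [P_{ij}^{(1,1)} - P_{ij}^{(0,1)}] \lambda_j + [P_{ij}^{(1,0)} - P_{ij}^{(0,0)}] (1 - \lambda_j) \big) \tag{13}$$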


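From equation 13, a pruning argument for minimization, reminiscent of DEE, can be
derived. Whenever the gradient with respect to a parameter $\lambda_i$ is negative for all
configurations of $\lambda \in [0, 1]^{\log_2 N}$ (equation 14), then $\lambda_i = 1$ minimizes $\tilde{P}$. This condition is
only met when inequality 15 is fulfilled:

$$s(i) = 1 \;\Leftarrow\; \frac{\partial \tilde{P}}{\partial \lambda_i} < 0 \;\; \forall \lambda_j \in [0, 1] \;\Leftrightarrow \tag{14}$$

$$P_i^{(1)} - P_i^{(0)} < \sum_{j \neq i} \min \big\{ P_{ij}^{(0,0)} - P_{ij}^{(1,0)},\; P_{ij}^{(0,1)} - P_{ij}^{(1,1)} \big\} \tag{15}$$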


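Conversely, a positive gradient implies that $\lambda_i = 0$ (equation 16), and the corresponding
necessary and sufficient condition can be found in equation 17. Thus, it has been
demonstrated that $\tilde{P}$ naturally leads to DEE-like algorithms.

$$s(i) = 0 \;\Leftarrow\; \frac{\partial \tilde{P}}{\partial \lambda_i} > 0 \;\; \forall \lambda_j \in [0, 1] \;\Leftrightarrow \tag{16}$$

$$P_i^{(1)} - P_i^{(0)} > \sum_{j \neq i} \max \big\{ P_{ij}^{(0,0)} - P_{ij}^{(1,0)},\; P_{ij}^{(0,1)} - P_{ij}^{(1,1)} \big\} \tag{17}$$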
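The pruning argument can be sketched for a small pairwise-additive model. The sketch
assumes that the parameter of a site weights occupation 1, so that the slope along that
parameter is linear in every other parameter and attains its extremes at 0 or 1. The
data layout (`unary[i][mu]`, `pair[(i, j)][mu][nu]` for `i < j`), the function names, and
the numerical values are hypothetical. A site is fixed to 1 when the largest possible
slope is negative and to 0 when the smallest possible slope is positive; the result is
compared with a brute-force minimum.

```python
from itertools import product


def pair_term(pair, i, j, mu, nu):
    """Pair contribution for sites i, j in any order (stored for i < j)."""
    return pair[(i, j)][mu][nu] if i < j else pair[(j, i)][nu][mu]


def slope_bounds(i, unary, pair):
    """Smallest and largest slope along site i over the whole unit cube."""
    low = high = unary[i][1] - unary[i][0]
    for j in range(len(unary)):
        if j == i:
            continue
        a = pair_term(pair, i, j, 1, 1) - pair_term(pair, i, j, 0, 1)
        b = pair_term(pair, i, j, 1, 0) - pair_term(pair, i, j, 0, 0)
        low += min(a, b)
        high += max(a, b)
    return low, high


def prune(unary, pair):
    """Fix every site whose slope has a definite sign everywhere."""
    fixed = {}
    for i in range(len(unary)):
        low, high = slope_bounds(i, unary, pair)
        if high < 0:
            fixed[i] = 1      # always downhill toward occupation 1
        elif low > 0:
            fixed[i] = 0      # always uphill, so occupation 0 is minimal
    return fixed


def total(bits, unary, pair):
    """Pairwise-additive property of one discrete configuration."""
    value = sum(unary[i][b] for i, b in enumerate(bits))
    return value + sum(t[bits[i]][bits[j]] for (i, j), t in pair.items())


# Hypothetical three-site model (arbitrary units).
unary = [[0.0, -5.0], [1.0, 0.0], [0.0, 0.5]]
pair = {(0, 1): [[0.0, 1.0], [0.5, -1.0]],
        (0, 2): [[0.0, 0.0], [1.0, 0.5]],
        (1, 2): [[0.0, 2.0], [-1.0, 0.0]]}

fixed = prune(unary, pair)
best = min(product([0, 1], repeat=3), key=lambda b: total(b, unary, pair))
assert all(best[i] == value for i, value in fixed.items())
```

The condition is sufficient but not necessary for the discrete optimum: a site that is not
fixed by the bounds simply remains undetermined and must be searched.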
2.4 Library Construction and Ordering


The choice of enumeration of the library $C$ determines the assignment of specific molecules
to $\lambda$. Consequently, this choice greatly influences the characteristics of the interpolated
property $\tilde{P}$, such as its smoothness. Just exchanging the position of two neighboring
molecules in the library changes the sign of the derivative at the corresponding $\lambda$. If the
Hessian of the pairwise-additive property function is positive-semi-definite, the
corresponding $\tilde{P}$ is convex and optimization quickly reaches the global minimum.
Steepest-descent or Newton-Raphson algorithms then locate the property extrema (minima).
It is therefore beneficial to find an ordering of the library that produces a convex property
surface. The linearity in each parameter $\lambda_i$ implies convexity of $\tilde{P}$ with respect to that
parameter.

Assuming that molecules of similar structure have similar properties, a measure of
similarity may be used to reduce the ruggedness (non-convexity) of the interpolated
property $\tilde{P}$. One choice to facilitate smooth property surfaces is the enumeration of
molecules by subsequent substitutions from a starting compound (figure 2). The
substitutions may be defined recursively. Each level of a hierarchy of substitutions consists
of a molecular fragment or atom to be connected to the next higher level, a list of
substitution sites, and a set of subsequent levels for each site (figure 2). Each element of the
set of subsequent levels is identified with a coefficient between 0 and 1, and the sum of
these coefficients for each set must equal 1 (see equation 18). For a case in which more than
two possible substitutions are available at a site, the bit-string representation must be
extended to allow mixed numeric bases $b_k$. The properties discussed above remain
unchanged in this alternative interpolation (equation 20).

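$$\sum_j \lambda_{i,j} = 1, \qquad j \in \{0, \ldots, b_i - 1\} \tag{18}$$

$$s = \sum_i \Big( \prod_{k=0}^{i-1} b_k \Big) \sum_{j=0}^{b_i - 1} j \cdot s(i, j), \qquad s(i, j) \in \{0, 1\}, \quad \sum_j s(i, j) = 1 \tag{19}$$

$$\tilde{P}(\{\lambda_{i,j}\}) = \sum_{s=0}^{N-1} P_s \prod_i \prod_{j=0}^{b_i - 1} \lambda_{i,j}^{\,s(i,j)} \tag{20}$$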
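The mixed-base numbering of library members can be illustrated with a short sketch.
It assumes that site $i$ offers $b_i$ choices, numbered from 0, and that the place value of
site $i$ is the product of the bases of all earlier sites. The function names are
hypothetical, and the example bases (2, 5, 5, 4) are chosen only because their product
is 200.

```python
def encode(choices, bases):
    """Map one choice per site to a single molecule number (mixed radix)."""
    number, place = 0, 1
    for choice, base in zip(choices, bases):
        if not 0 <= choice < base:
            raise ValueError("choice out of range for its site")
        number += choice * place
        place *= base          # place value of the next site
    return number


def decode(number, bases):
    """Inverse of encode: recover the choice made at every site."""
    choices = []
    for base in bases:
        number, choice = divmod(number, base)
        choices.append(choice)
    return choices


bases = [2, 5, 5, 4]           # illustrative: 2 * 5 * 5 * 4 = 200 members
last = encode([1, 4, 4, 3], bases)
print(last, decode(last, bases))

# Every number in the library round-trips through decode and encode.
assert all(encode(decode(s, bases), bases) == s for s in range(200))
```

When every base equals 2, this reduces to the ordinary bit-string representation used in
section 2.1.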
Figure 2. Substitution pattern hierarchy. $Y$ contains a Z-matrix that has several open valences. The
first can be filled with substituents found in $X_1$, which are connected to substituents found in
$X_2$, etc. The second is filled from $X_m$ in the same manner. The $X_i$ themselves are taken
from a set of substitution patterns of the same kind as $Y$. Each instance is anchored to $Y$ at
the appropriate valence. The substitutions are terminated by Z-matrices that have no open
valences.

2.5  Inclusion  of  Conformational  Complexity 

For  each  molecule,  it  is  important  to  find  low-energy  conformers  for  the  optimization  to  be 
physically  meaningful.  For  each  molecule  in  the  molecular  library,  another  optimization 
can  be  started  with  the  (second)  library  consisting  of  the  corresponding  conformers.  Each 
dihedral  degree  of  freedom  can  be  treated  as  a  substitution  site  at  the  lowest  level  with  a 
number  of  rotations  as  possible  substitutions,  as  is  commonly  done  in  conformational 
searches  (6,19).  In  this  manner,  the  conformational  search  can  be  introduced  as  the  lowest 
level  in  the  previously  described  substitution  hierarchy.  Thus,  the  conformational  search 
precedes  property  computation  in  property  optimizations.  More  general  constraints  on  the 
optimal  molecule  can  be  introduced  via  alternate  methods,  like  Lagrange  multipliers  or 
stochastic  algorithms.  Lagrange  multipliers  can  be  implemented  using  (soft)  penalty 
functions  with  weightings  that  increase  throughout  the  optimization. 

2.6  Algorithm 

Here, a line search algorithm is used: each parameter $\lambda_i$ is followed to a
minimum in that direction before varying the next parameter $\lambda_{i+1}$. Maximization via this
algorithm can be achieved, for instance, by minimizing the negative objective function.
This line search algorithm is an implicit branch-and-bound algorithm. A flowchart for the
employed recursive algorithm appears in figure 3, and application of the algorithm to a
small example is discussed in section 3 under framework A (see also the accompanying
figure 6).


[Flowchart; first box: “1. Initial structure”.]

Figure 3. Flowchart of the algorithm.


Since $\tilde{P}(\lambda)$ is locally convex, this algorithm converges locally. The line-search steps 4-7 in
figure 3 correspond to a linear tree search or branch-and-bound algorithm. The
computational complexity is on the order $O(\log N)$ in the library size $N$ due to the linear
dependence on the $\log N$ variables. In contrast to conventional branch-and-bound methods,
no structures are explicitly excluded from the search space. Since each molecule chosen in
step 8 in figure 3 is strictly better in the sense of property optimization than its predecessor,
the algorithm quickly converges to a local property value minimum in the library (20).
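A site-by-site line search over a discrete library can be sketched as follows. This is a
simplified illustration under stated assumptions, not a transcription of the flowchart in
figure 3: each site is scanned over all of its choices while the others are held fixed, a
strictly better molecule replaces the current one, and sweeps repeat until no single
substitution improves the property. The function name, the toy property, and the bases
are hypothetical; property values are cached so that each molecule is computed once.

```python
def line_search(prop, start, bases):
    """Minimize prop over tuples of per-site choices, one site at a time.

    Returns (best choices, best value, number of distinct evaluations).
    """
    cache = {}

    def evaluate(choices):
        if choices not in cache:
            cache[choices] = prop(choices)   # each molecule computed once
        return cache[choices]

    current = tuple(start)
    best = evaluate(current)
    improved = True
    while improved:                          # repeat sweeps until converged
        improved = False
        for site, base in enumerate(bases):
            for choice in range(base):
                trial = current[:site] + (choice,) + current[site + 1:]
                value = evaluate(trial)
                if value < best:             # accept strict improvements only
                    current, best, improved = trial, value, True
    return current, best, len(cache)


# Toy property with a single minimum at a hypothetical target molecule.
target = (1, 3, 0, 2)

def toy_property(choices):
    return sum((c - t) ** 2 for c, t in zip(choices, target))

bases = [2, 5, 5, 4]                         # 200 molecules in total
result, value, n_computed = line_search(toy_property, (0, 0, 0, 0), bases)
print(result, value, n_computed)
```

For this separable toy property the search evaluates only a small fraction of the 200
members. Because only strict improvements are accepted, the sequence of accepted
molecules cannot cycle, but on a rugged surface the end point is a local minimum only.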

All  property  minima  for  this  algorithm  are  minima  for  the  steepest-descent  derived  method 
and  vice  versa.  This  algorithm  traverses  the  library  in  a  smoother  fashion  compared  to  the 
steepest-descent  derived  method,  successfully  employed  by  Keinan  et  al.  (15),  because  the 
molecules are traversed variationally by single substitutions. While the steepest-descent
based approach can sidestep barriers in the immediate vicinity efficiently, due to the
simultaneous change of potentially several bits, the variational nature of this line
search guarantees convergence, which is particularly useful on rugged property surfaces.


For the sake of computational accessibility, all geometries were optimized using the
semi-empirical Austin Model 1 (AM1) method as implemented in Gaussian03 (21). The
static electronic hyperpolarizability was computed using Intermediate Neglect of
Differential Overlap/Screened Approximation (INDO/S) as implemented in Complete
Neglect of Differential Overlap (CNDO) by Reimers et al. (22), using the sum-over-states
expression in equation 21. The configuration interaction (CI) space was spanned by up to
100 unoccupied or occupied orbitals to accommodate the large number of electrons in
some of the investigated systems.


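$$\beta_{ijk} = \sum_{u, v} \frac{\langle 0 | x_i | u \rangle \langle u | x_j - \mu_j | v \rangle \langle v | x_k | 0 \rangle}{E_{0u} E_{0v}} \tag{21}$$

$$\beta_i = \frac{1}{3} \sum_j \big( \beta_{ijj} + \beta_{jij} + \beta_{jji} \big) \tag{22}$$

$$\beta_\mu = \frac{\mu \cdot \beta}{|\mu|}, \qquad \beta_0 = [\text{right-hand side not legible in the source}] \tag{23}$$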


where $E_{0v}$ is the excitation energy from the ground state to the $v$th excited state, $\beta$ is the
static electronic hyperpolarizability with components $\beta_i$ and corresponding
hyperpolarizability tensor elements $\beta_{ijk}$, $\beta_0$ is the isotropic hyperpolarizability, $\beta_\mu$ is the
hyperpolarizability component in the direction of the ground state dipole moment, $x$ is the
dipole operator with components $x_i$, and $\mu$ is the ground state dipole moment with
components $\mu_i$.
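The reduction of a hyperpolarizability tensor to a vector and to its projection on the
dipole moment can be sketched as below. The sketch assumes the commonly used convention
in which component $i$ of the vector is one third of the sum over $j$ of the three tensor
elements with indices $ijj$, $jij$, and $jji$, and in which the dipole-direction component is
the projection of that vector on the unit dipole vector. These conventions, the function
names, and the numbers are assumptions for illustration; prefactors differ between
conventions.

```python
from math import sqrt


def beta_vector(beta):
    """Vector part of a 3x3x3 tensor given as nested lists beta[i][j][k]."""
    return [
        sum(beta[i][j][j] + beta[j][i][j] + beta[j][j][i] for j in range(3)) / 3.0
        for i in range(3)
    ]


def beta_along_dipole(beta, mu):
    """Projection of the vector part on the direction of the dipole mu."""
    norm = sqrt(sum(m * m for m in mu))
    if norm == 0.0:
        raise ValueError("dipole moment must be nonzero")
    return sum(b * m for b, m in zip(beta_vector(beta), mu)) / norm


# Hypothetical tensor with a single nonzero element along z.
tensor = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
tensor[2][2][2] = 6.0

print(beta_vector(tensor))
print(beta_along_dipole(tensor, (0.0, 0.0, 2.0)))
print(beta_along_dipole(tensor, (0.0, 0.0, -1.0)))
```

The sign of the projected value depends on the relative orientation of the dipole and the
vector part, so reversing the dipole reverses the sign.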


Figures 4 and 5 summarize the tolane-based systems studied. The spectroscopic
properties of tolanes are favorable for applications, so their first and second
hyperpolarizabilities have been studied extensively (23, 24). In addition, these structures
are readily modified (25) and present a large number of possible derivatives. Tolanes,
therefore, present a particularly rich testbed for these optimization studies.


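[Structure diagram: tolane backbone with O2N and NR2 end groups and labeled
substituent positions, including X1, X2, X6, and X7.]

Figure 4. Tolane framework in which $X_i$ and $R_i$ are variable substituents.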
A. Y = A–B'–(linker bearing H atoms, as drawn in the original figure)–B–D;
   B, B' ∈ {o-, m-, p-, o'-, m'-phenyl}; D ∈ {OH, SH, NH2, NMe2}; A ∈ {CN, NO2}

B. X5 is empty; R = H; X1, X2, X3, X4, X6, X7, X8, X9 ∈ {CH2OCH3, CH2OH, CH2NH2, NH2, OH,
   NO2}

C. (1) X1 = X4 = X7 = X8 = H; A, B, C, D, X2, X3, X6, X9 ∈ {CH2OMe,
       CH2OH, CH2NH2, NH2, OH, CHO, NO2}; R = CH3

   (2) X1 = X4 = X7 = X8 = H; A, B, C, D, X2, X3, X6, X9 ∈ {H, F, Cl,
       Br}; R = CH3

   (3) X1 = X4 = X7 = X8 = H; A, B, C, D, X2, X3, X6, X9 ∈ {CH2OMe,
       CH2OH, CH2NH2, NH2, F, Cl, Br, CHO, NO2}


Figure 5. Tolane libraries investigated.


3.  Results  and  Discussion 


Overall, five different tolane libraries were investigated (general structure in figure 4). The
first three sets of molecules are optimized with respect to their static isotropic
hyperpolarizability $\beta_0$ (equation 23), while the remaining sets are optimized with respect to
the component of the hyperpolarizability in the direction of the dipole, $\beta_\mu$ (equation 23).

3.1  Framework  A 

Validation  of  the  algorithm  was  performed  on  the  structure  framework  A  in  figure  5. 

Figure  6  shows  the  progress  of  the  algorithm.  There  are  200  molecules  in  this  library,  but 
hyperpolarizabilities  of  only  24  different  molecules  were  computed  during  the  optimization, 
the  minimum  number  of  molecules  required  for  the  algorithm  to  finish  the  optimization. 
Regardless  of  the  starting  structure,  the  algorithm  consistently  finishes  with  the  global 
hyperpolarizability  optimum  (figure  6),  which  has  also  been  confirmed  experimentally  (26). 
For  comparison,  if  the  library  is  searched  randomly,  the  expected  number  of  computed 
molecules  before  finding  the  global  minimum  is  200  molecules.  If  repeats  are  avoided,  then 
still  101  molecules  would  need  to  be  computed  on  average  in  order  to  obtain  the  same 
result. 
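The two reference numbers for a random search follow from elementary probability and can
be reproduced with a short sketch. It assumes a single global optimum in a library of
size $N$ and uniform random draws: with repeats allowed, the number of draws is
geometrically distributed with mean $N$; without repeats, the optimum is equally likely to
be at any position in a random order, giving a mean of $(N + 1)/2$. The function names are
hypothetical.

```python
from fractions import Fraction


def expected_draws_with_repeats(n):
    """Mean of a geometric distribution with success probability 1/n."""
    return Fraction(n)


def expected_draws_without_repeats(n):
    """Mean position of one marked item in a random permutation of n items."""
    return Fraction(sum(range(1, n + 1)), n)   # equals (n + 1) / 2


n = 200   # size of the framework A library
print(expected_draws_with_repeats(n))            # 200
print(float(expected_draws_without_repeats(n)))  # 100.5
```

The exact value without repeats, 100.5, corresponds to the figure of 101 molecules quoted
above when rounded up to a whole molecule.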
[Figure 6 annotations: “Steps 4-8, site 1: +4 molecules”; $\beta_0 = 7.83 \times 10^{-30}$ esu;
$\beta_0 = 46.43 \times 10^{-30}$ esu; “repeat cycle, steps 2-10: +11 molecules”.]

Figure 6. Progress of the optimization algorithm. The steps refer to the steps in figure 3. The number
of molecules indicated is the number of previously unvisited molecules for which the property
is computed in performing the steps. Carbons are marked in orange, hydrogens in white,
oxygens in red, and nitrogens in light blue.


3.2  Framework  B 


The static hyperpolarizability $\beta_0$ of framework B in figure 5 optimizes to an unstable,
perhaps explosive, structure with mostly nitro- and amino-substituents (figure 7). The final
computed $\beta_0$-value was $131.9 \times 10^{-30}$ esu after 121 computed structures from
$6^8 \approx 1.7 \times 10^6$ possible molecules. Additionally, conformational analysis was performed.
CHO and OH were allowed two possible orientations in the plane of the tolane. For CH2OH
and CH2NH2, three-fold rotation around the C-O and C-N bonds, respectively, was included,
while only two-fold rotations around the bonds connecting to the tolane framework were
allowed.

Figure 7. Final structure of framework B. Carbons are marked in orange, hydrogens in white, oxygens
in red, and nitrogens in light blue.

3.3 Framework C-1

The static hyperpolarizability for compounds in C-1 of figure 5 was optimized starting from
three different initial structures. A total of $7^8 \approx 5.8 \times 10^6$ possible molecules exist in this
family. Conformational considerations were treated as in framework B. Two of the three
runs converged to the same structure ($\beta_0 = 214.6 \times 10^{-30}$ esu), while the third converged to
a second structure with comparable hyperpolarizability ($\beta_0 = 216.9 \times 10^{-30}$ esu, see tables 1
and 2). All three runs finished after computing less than 0.1% of all possible molecules and
achieved three- to four-fold improvements of the hyperpolarizability. Comparing the two
structures, some common motifs emerge: the variable fragments X2 and X3 contain
nitro-groups, while X6 and X9 are occupied by amino-groups; furthermore, positions B and
C are occupied by electron acceptors and sites A and D are occupied by electron donors. It
is notable that not all positions are occupied by the “strongest” donors or acceptors in the
substitution set, i.e., NH2 and NO2, respectively.

3.4  Framework  C-2 

Halogen  substituents  do  not  necessitate  extensive  conformational  analysis,  so  they  allow 
the  evaluation  of  the  optimization  method  without  added  constraints.  The  structures  C-2 
in figure 5 were optimized for the hyperpolarizability in the direction of the dipole moment
($\beta_\mu$, see equation 23). Entries (a) and (c) in table 3 show the results of two optimizations
of  framework  C-2  in  figure  5  starting  from  the  same  initial  structure  with  all  substitutions 
set  to  hydrogens.  In  this  case,  convergence  to  a  hyperpolarizability  maximum  is  confirmed 
to  be  logarithmic  in  the  library  size,  i.e.,  squaring  the  library  size  from  256  to  65536  leads 
to  roughly  twice  the  number  of  computed  molecules. 
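The scaling statement can be checked against the numbers reported in table 3 with a few
lines of arithmetic. The sketch below only restates those reported values (28 molecules
computed for the library of 256, 67 for the library of 65536) and compares the ratio of
computed molecules with the ratio of the logarithms of the library sizes; the variable
names are illustrative.

```python
from math import log2

# Library size -> molecules computed, as reported for runs (c) and (a).
runs = {256: 28, 65536: 67}

small, large = 256, 65536
variable_ratio = log2(large) / log2(small)     # ratio of binary variables
computed_ratio = runs[large] / runs[small]     # ratio of computed molecules
size_ratio = large / small                     # ratio of library sizes

print(variable_ratio, round(computed_ratio, 2), size_ratio)
```

The number of computed molecules grows by a factor close to the ratio of the logarithms,
while the library itself grows by a factor of 256.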


Table 1. Starting and final structures of framework C-1 of figure 5. Carbons are marked
in orange, hydrogens in white, oxygens in red, and nitrogens in light blue.


Table 2. Starting and final hyperpolarizabilities and number of computed molecules for
framework C-1 in figure 5.

Run    Initial β0 / 10^-30 esu    Final β0 / 10^-30 esu    Molecules computed
1      55.1                       214.6                    157
2      71.0                       214.6                    109
3      49.9                       216.9                    169



Table 3. Optimized structures for frameworks C-2 in figure 5.

Run    Compound                                 βμ / 10^-30 esu    Molecules computed    Library size
(a)    A,D,X6,X9=H; X2,X3=Br; B=Cl; C=F         84.1               67                    65536
(b)    A,B,C,D,X6,X9=H; X2,X3=Br                77.4               69                    65536
(c)    A,D=Br; X6,X9=H; X2,X3=Br; B,C=F         83.5               28                    256
(d)    A,D=H; X6,X9=H; X2,X3=Br; B,C=Br         83.2               28                    256

Figure 8. Largest $\beta_\mu$ structure for framework C-2 in figure 5. Carbons are marked in
orange, hydrogens in white, oxygens in red, nitrogens in light blue, bromine
in dark red, fluorine in dark blue, and chlorine in purple.

The stability of the optimization procedure was tested by constraining substitutions to be
symmetric with respect to the mirror plane perpendicular to the plane of the backbone
(runs (c) and (d) in table 3), as well as by starting from different initial structures: runs (a)
and (c) were started with all substituents set to hydrogen, while run (b) starts from
X2 = Br and X8 = F, and run (d) starts from X2 = X3 = Br and X7 = X8 = F. The
hyperpolarizabilities of the initial structures were within 4 units of $50 \times 10^{-30}$ esu. Since
the procedure is not a global optimization algorithm, it is possible to end at different local
maxima; here, each run ended in a different structure with corresponding
hyperpolarizabilities ($\beta_\mu / 10^{-30}$ esu = 84.1, 77.4, 83.5, 83.2, respectively; see table 3).
Nonetheless, the optimizations lead to significant and comparable improvements between
runs. The maxima found all place bromine in the X2 and X3 positions, implying that a
large fraction of the gain in $\beta_\mu$ arises from bromine-to-amino charge-transfer interactions.
3.5 Framework C-3


Combining parts of the libraries of C-1 and C-2 in figure 5, structures C-3 in figure 5 were
subjected to optimization of the static hyperpolarizability in the direction of the dipole
moment, $\beta_\mu$. Four optimizations from different starting configurations were performed
(see tables 4 and 5 for results). The “unbiased” first optimization leads to a five-fold
increase in $\beta_\mu$ ($37.0 \to 181.5 \times 10^{-30}$ esu). The final structure (see table 4) indeed is a
mixture of the results for C-1 and C-2 in figure 5. The second optimization was started
with a structure concentrating equal numbers of donors on one side and acceptors on the
other, analogous to the final structure of framework B in figure 5. This starting structure
exhibited only a marginally larger hyperpolarizability ($55.6 \times 10^{-30}$ esu) than the
“unbiased” starting structure, but optimized to an alternating donor-acceptor arrangement
($171.6 \times 10^{-30}$ esu) that failed to reach the optimum found in the first optimization. The
low hyperpolarizability is presumably due to the benzene rings twisting out of plane and
reducing conjugation.

A biased starting point, with alternating donor and acceptor groups, leads to a marginally
increased final hyperpolarizability ($191.6 \times 10^{-30}$ esu) over the first optimization. The
attempt to exceed this value by substituting the “strongest” electron donors and acceptors,
NH2 and NO2, fails despite the fact that this structure is indeed a local maximum
($173.3 \times 10^{-30}$ esu). All four optimization runs finish after computing less than 0.001% of
the possible $9^8 \approx 4.3 \times 10^7$ molecules.
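The library sizes and the fractions of each library that were actually computed can be
reproduced directly from the numbers quoted in sections 3.2, 3.3, and 3.5. The sketch
below uses the largest reported number of computed molecules for each framework; the
dictionary layout and variable names are illustrative.

```python
# Framework -> (choices per site, number of sites, most molecules computed in a run).
libraries = {
    "B": (6, 8, 121),
    "C-1": (7, 8, 169),
    "C-3": (9, 8, 181),
}

percent_computed = {}
for name, (choices, sites, computed) in libraries.items():
    size = choices ** sites                      # every site chosen independently
    percent_computed[name] = 100.0 * computed / size
    print(f"{name}: {size} molecules, {percent_computed[name]:.5f}% computed")
```

For each of these frameworks even the longest run evaluates far less than 0.1% of the
library, and for C-3 less than 0.001%.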




Table  4.  Starting  and  final  structures  of  framework  C-3  in  figure  5.  Carbons  are  marked 
in  orange,  hydrogens  in  white,  oxygens  in  red,  nitrogens  in  light  blue,  bromine 
in  dark  red,  fluorine  in  dark  blue,  and  chlorine  in  purple. 


Table  5.  Starting  and  final  hyperpolarizabilities  and  number  of  computed  molecules  for 
framework  C-3  in  figure  5. 


Run    Initial βμ / 10^-30 esu    Final βμ / 10^-30 esu    Molecules computed
1      37.0                       181.5                    181
2      55.6                       171.6                    153
3      139.8                      191.6                    161
4      173.3                      173.3                    65



4.  Summary  and  Conclusions 


We  have  introduced  an  embedding  of  discrete  molecular  spaces  in  a  continuous  space, 
similar  to  the  embedding  of  discrete  Hamiltonians  in  LCAP  (27).  From  this  embedding,  an 
optimization  based  on  differentiation  in  the  continuous  space  was  developed.  The 
theoretical  framework  transforms  a  discrete  optimization  problem  into  a  continuous 
optimization  problem,  which  then  gives  rise  to  a  discrete  optimization  strategy.  The 
theoretical complexity of the line-search algorithm used is $O(\log N)$ in the library size $N$,
and applications of the algorithm to a variety of conditions confirm the method’s
effectiveness.  A  design  strategy  for  tolanes  of  alternating  donors  and  acceptors  along  a 
conjugated  framework  is  suggested  by  the  optimization  results.  Further  applications  and 
improvements  are  under  study  including  an  extension  to  second-order  derivative  methods, 
probabilistic  methods  (28),  and  dynamic  ordering  of  the  parameters  to  achieve  overall 
convexity. 
List of Symbols, Abbreviations, and Acronyms


AM1       Austin Model 1
ARL       U.S. Army Research Laboratory
ARO       Army Research Office
CI        configuration interaction
CNDO      Complete Neglect of Differential Overlap
DARPA     Defense Advanced Research Projects Agency
DEE       dead-end-elimination algorithms
INDO/S    Intermediate Neglect of Differential Overlap/Screened Approximation
LCAP      linear combination of atomic potentials